Abstract. We present additional tests of our algorithm aimed at filtering out
systematics due to data reduction and instrumental imperfections in time series 
obtained by ensemble photometry. Signal detection efficiency is demonstrated, 
and a method of decreasing the false alarm probability is presented. Including 
the recently discovered transiting extrasolar planet HAT-P-1, we show various 
examples of the signal reconstruction capability of the method.



1. Introduction 

One of the most serious challenges in the hunt for transiting extrasolar planets is
the removal of the various systematics remaining in the photometric databases
even after employing sophisticated methods of CCD image reduction. Sometimes
called "red noise" (Pont, Zucker & Queloz 2006), systematics/trends may
show up in many ways. Their most common appearance is a drift with
1 d^{-1} frequency due to variations of the point-spread function and numerous
other parameters, for example focus change, sub-pixel coordinate drifts, etc.
In addition to these most trivial systematics, we may have many others, from
transients (e.g., imperfect removal of cosmic rays) to periodic saturation of bright
stars, depending on Moon phase. During the past several years it has been
realized that, except for the simple, high signal-to-noise ratio cases, filtering out
these systematics from the databases is absolutely crucial for transit searches.
Therefore, efforts have been made to devise post-processing
algorithms that are capable of whitening the data, i.e., removing the systematics.
Two methods in this field have attracted wider interest.^1 The method
SysRem by Tamuz, Mazeh & Zucker (2005) essentially uses an iterative principal
component analysis to filter out the most prominent systematics from the data.
The Trend Filtering Algorithm (TFA) by Kovács, Bakos & Noyes (2005) is a
least-squares method that is capable of filtering out nearly arbitrary systematics,
assuming that the selected set of templates is "flavorous" enough, i.e., it contains
the light curves necessary for the approximation of the type of trend observed
in the target. Here we present tests on the signal recovery capability of TFA.



^1 A third method by Kruszewski & Semeniuk (2003), which employs straightforward iterative
Fourier filtering, has apparently been used less frequently.

2. The Algorithm

For completeness, we briefly summarize the basic steps of TFA. First we select
a template set of M light curves from the photometric database of the field of
interest. From these {X_j(i); j = 1,...,M; i = 1,...,N} time series (sampled at
N moments of time), we construct the following filter:

    F(i) = \sum_{j=1}^{M} c_j X_j(i) .    (1)

The coefficients {c_j} for a target light curve {Y(i)} are determined by
minimizing the following expression:

    \mathcal{D} = \sum_{i=1}^{N} [Y(i) - A(i) - F(i)]^2 .    (2)

Here the function A(i) is derived in the following way:

    A(i) = \begin{cases}
        \langle Y \rangle = \mathrm{const} ; & \text{for period search} , \\
        A(i) \Longleftrightarrow Y(i) - F(i) ; & \text{for signal reconstruction} .
    \end{cases}    (3)

Namely, in the case of period search we assume that the observed signal is
dominated by systematics and noise, and therefore the filter is expected to
yield minimum dispersion around the constant signal average ⟨Y⟩.^2 Once the
signal is identified, we can recover its shape by iteratively approximating the
noiseless and trend-free signal {A(i)}. (The iteration is indicated by the symbol
\Longleftrightarrow in Eq. (3).) In this iterative process we assume that {A(i)} can be represented
by a low-parameter model, and that the observed signal minus the systematics
yields the true signal with white noise.
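The period-search mode of Eqs. (1)-(3) can be illustrated with a short sketch. This is not the authors' code: it assumes that the templates are zero-averaged before the fit, solves the normal equations of Eq. (2) directly, and uses the hypothetical names `solve_linear` and `tfa_filter` and a made-up six-point target.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so that the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("template set is (nearly) linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def tfa_filter(target, templates, model=None):
    """Return (coefficients c_j, filtered series Y - F).

    model is A(i); None selects the period-search mode, A(i) = <Y>.
    Zero-averaging the templates is an assumption of this sketch.
    """
    n = len(target)
    if model is None:
        mean_y = sum(target) / n
        model = [mean_y] * n
    x = []
    for tpl in templates:
        mu = sum(tpl) / n
        x.append([v - mu for v in tpl])
    resid = [y - a for y, a in zip(target, model)]
    # Normal equations of Eq. (2): sum_k <X_j, X_k> c_k = <X_j, Y - A>.
    g = [[sum(p * q for p, q in zip(xj, xk)) for xk in x] for xj in x]
    h = [sum(p * r for p, r in zip(xj, resid)) for xj in x]
    c = solve_linear(g, h)
    filtered = [target[i] - sum(cj * xj[i] for cj, xj in zip(c, x))
                for i in range(n)]
    return c, filtered


# Made-up example: a constant star contaminated by two template trends.
templates = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],      # linear drift
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],   # alternating pattern
]
target = [10.0 + 0.5 * a - 0.2 * b for a, b in zip(*templates)]
coeffs, filtered = tfa_filter(target, templates)
```

In this noise-free example the trend lies exactly in the space spanned by the templates, so the fit recovers the coefficients 0.5 and -0.2 and the filtered series is constant.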



3. Transit Detection Efficiency 

The datasets used in this paper are listed in Table 1. We test transit detection
efficiency (TDE), false alarm probability (FAP) and signal reconstruction
capability (SRC). For testing SRC on datasets different from those of HATNet
(Bakos et al. 2004), we use the HAT-P-1 (Bakos et al. 2006) observations made with
the 60/90/180 cm Schmidt telescope of the Konkoly Observatory.

To show that TFA is capable of filtering out nearly all systematics, we
compute the distribution function of the peak frequencies obtained by the BLS
analysis (Kovács, Zucker & Mazeh 2002). The result of this analysis for field
#125 is shown in Fig. 1. We see that the original data include many stars with
various periodic systematics, some of which are not easy to relate to the
daily change of the observational conditions. For example, the strongest peak at
~0.16 d^{-1} is most probably associated with the saturation of the bright stars,
because after omitting the first ~400 stars, the peak at this frequency becomes
less prominent than the ones corresponding to other systematics.

^2 If the observed signal is dominated by the true signal, the above approximation for frequency
search is still applicable, since the temporal behavior of the systematics is still unlikely to
be the same as that of the signal. In both the systematics- and signal-dominated cases the
true signal suffers from some distortion, but in the latter case the distortion will become more
obvious.



Table 1: Properties of the test datasets

Set                 N      N_star   T [d]    I [mag]       Purpose
HATNet #125         6100   24980    141.0    6.3 - 9.9     TDE, FAP
HATNet #127         2100   13620    167.0    7.6 - 9.7     FAP
HATNet #148         5000   3430     112.0    7.4 - 13.6    SRC
HAT-P-1, Schmidt    650    202      0.32     9.6 - 16.5    SRC

Notes: N: number of data points per object; N_star: number of stars in the field; T: total
time span; I [mag]: I magnitude range in the field.




Figure 1: Distribution of the peak frequencies of the BLS spectra of the
2000 brightest stars in HATNet field #125. Left panel: raw (original) data;
right panel: TFA-filtered data with 900 templates. Please note that the application of
TFA leads to a nearly flat distribution of the peak frequencies, as expected for
white noise signals.



We have already demonstrated in Kovács et al. (2005) the ability of TFA
to detect shallow transits that are buried in noise and systematics. Because
the statistics we use have changed slightly since then, here we show the results
of tests conducted with the new statistics.

We inject a periodic transit signal into a given target from the first 2000
stars of field #125. Then we run a TFA/BLS analysis on the target and check if
the DSP parameter (Kovács & Bakos 2005) corresponding to the highest peak
in the frequency spectrum exceeds a given limit. The DSP parameter expresses
the significance of the dip corresponding to the transit derived from the analysis.
In order to exclude binaries with light reflection and gravitational effects, DSP
also includes weighting by the most significant Fourier component of the
out-of-transit variation. When computing DSP, we always use the TFA code in the
signal reconstruction mode to get a better estimate of this parameter.

Properties of the injected signal and the number of detections are listed in
Table 2. We note that the synthetic signal has a flat out-of-transit part and
a trapezoidal shape with rather long ingress/egress durations. The condition of
detection is given by a cutoff imposed on DSP. The value of this cutoff is large
enough to eliminate false alarms (see Sect. 4). We see that there is a highly
significant increase in the detection probability due to the application of TFA.
This increase is especially striking when the 500 brightest stars are tested.
These are the ones that are most seriously affected by systematics related to
saturation effects (see above). We note that the detection ratio can be slightly
increased (from 46% to 50%) if we choose templates only from this brighter set
of stars.

Table 2: Detection of injected transit signals in field #125

TFA    N_tot    N_d     N_d [%]
0      2000     972     49
1      2000     1340    67
0      500      51      10
1      500      228     46

Notes: Analysis: BLS, with minimum transit duration of 0.01 P_test, P_test ∈ [1.0, 100.0] d;
Detection condition: DSP > 8.0; Injected signal parameters: period, P = 5.123 d;
fractional transit length, q_tran = Δt_tran/P = 0.02; fractional ingress length, q_ingr =
Δt_ingr/Δt_tran = 0.40; transit depth, δ = −0.015 mag.
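The injected test signal can be sketched from the parameters quoted in the notes of Table 2. The function names `trapezoid_transit` and `inject`, the symmetric egress, and the choice of the transit center are assumptions of this sketch. The depth is simply added to the light curve with the sign given by the caller.

```python
def trapezoid_transit(t, period, q_tran, q_ingr, depth, t_center=0.0):
    """Magnitude offset of a trapezoidal transit at time t.

    q_tran = transit duration / period; q_ingr = ingress duration / transit
    duration. The egress is assumed to mirror the ingress, so q_ingr <= 0.5.
    depth is the offset at the flat bottom, in the caller's sign convention.
    """
    if not 0.0 < q_ingr <= 0.5:
        raise ValueError("q_ingr must be in (0, 0.5]")
    # Phase in [-0.5, 0.5), zero at mid-transit.
    phase = ((t - t_center) / period + 0.5) % 1.0 - 0.5
    half = 0.5 * q_tran            # half transit length in phase units
    ingress = q_ingr * q_tran      # ingress length in phase units
    x = abs(phase)
    if x >= half:
        return 0.0                 # flat out-of-transit part
    if x <= half - ingress:
        return depth               # flat bottom
    return depth * (half - x) / ingress   # linear ingress/egress


def inject(times, mags, **signal):
    """Add the synthetic transit to an observed light curve."""
    return [m + trapezoid_transit(t, **signal) for t, m in zip(times, mags)]


# Signal parameters from the notes of Table 2.
params = dict(period=5.123, q_tran=0.02, q_ingr=0.40, depth=-0.015)
mid_transit = trapezoid_transit(0.0, **params)
mid_ingress = trapezoid_transit(0.006 * 5.123, **params)
out_of_transit = trapezoid_transit(0.5 * 5.123, **params)
```

With q_ingr = 0.40 the ingress and egress together take up 80% of the transit, which is the "rather long ingress/egress" shape mentioned in the text. Only the central 20% of the transit is at full depth.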



4. False Alarms 

By "false alarms" we mean those cases when the detection statistics indicate the
presence of a signal, but the probability distribution of the statistics (derived
from pure noise) shows that the observed value may also occur due to a random
event, and the probability of this happening is "greater than we would like".
In assessing FAP we resort to direct statistical tests, in which we generate pure
Gaussian time series on the time base of the observed light curves. We analyze
these artificial time series and count the number of cases when DSP exceeds a
prescribed limit. In this way we get estimates of the FAP for a given dataset when
TFA and BLS are used.
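The counting test described above can be sketched as follows. The DSP statistic and the TFA/BLS steps are not reproduced here. Instead, `toy_dip_statistic` is a hypothetical stand-in (the significance of the lowest phase-bin average at one trial period), and the sampling times are made up. Only the Monte Carlo logic, i.e. Gaussian noise on a fixed time base and a count of the cases above the cutoff, follows the text.

```python
import random
import statistics


def toy_dip_statistic(times, values, period, n_bins=20):
    """Toy stand-in for DSP (NOT the statistic used in the paper).

    Returns the depth of the lowest phase-bin average in units of its
    standard error; larger values mean a more significant dip.
    """
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0.0:
        return 0.0
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        b = int(((t / period) % 1.0) * n_bins) % n_bins
        sums[b] += v - mean
        counts[b] += 1
    best = 0.0
    for s, c in zip(sums, counts):
        if c > 0:
            z = -(s / c) / (sigma / c ** 0.5)   # positive for dips
            best = max(best, z)
    return best


def estimate_fap(times, statistic, cutoff, n_trials=200, seed=1):
    """Fraction of pure-noise series whose statistic exceeds the cutoff."""
    rng = random.Random(seed)   # fixed seed: reproducible sketch
    hits = 0
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in times]
        if statistic(times, noise) > cutoff:
            hits += 1
    return hits / n_trials


# Hypothetical time base and trial period.
times = [0.37 * k for k in range(300)]


def stat(t, v):
    return toy_dip_statistic(t, v, period=5.123)


fap_low_cut = estimate_fap(times, stat, cutoff=1.0)
fap_high_cut = estimate_fap(times, stat, cutoff=4.0)
```

Because both calls use the same seed, they analyze the same noise realizations, so raising the cutoff can only lower the estimated FAP.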

The results of the tests are presented in Table 3. Several conclusions can be
drawn from this table. First, as expected, the application of TFA introduces
correlation in the time series.^3 This increases FAP by a substantial amount.
Second, a larger number of data points results in a decrease of FAP. Third, a larger
number of templates leads to stronger correlation, and therefore to an increase
of FAP. Fourth, although increasing the number of data points decreases FAP,
it remains relatively high even for higher DSP cutoff values (when
we already expect a visible sign of the (fake) transit in the folded/binned light
curve).^4



^3 This "side effect" is unavoidable in any data fitting. In applications, the net result will
depend on the relative weight of this correlation compared to the one introduced by the systematics.

^4 It is noted, however, that in many cases the high/moderate DSP detections are due to short
events containing few data points.

In an attempt to reduce FAP, the following method is suggested. By using
several TFA runs, corresponding to various template numbers, we compute DSP
values for the given database. Due to the way the template sets are constructed,
these results are expected to be largely independent of each other. In a
conservative approach to signal detection, we require the signal to be present in
all these runs. In the primary selection of transit candidates we require only that
the dip is negative (i.e., corresponding to dimming) and that DSP > DSP_cut.
The result of this multiple-template FAP filtering is shown in the lower three
lines of Table 3. We see that the method is already very effective with three
different TFA runs. For example, even for the sparsely sampled field #127, with
three or more different TFA runs we can filter out false alarms with a probability
better than 99.9% for signals with DSP > 7.



Table 3: Testing false alarms in fields #125 and #127

TFA            N_d (#127)              N_d (#125)
DSP_cut:     5      6      7         5      6      7
0           68      7               26      3      1
a          827    400    107       258     24      2
b          856    523    168       306     31      6
c          842    560    220       394     42      3
d          890    638    328       474     60      5
ab         386    112      9        73      2
abc        184     38      1        25
abcd        97     21      1        14

Notes: Test signal: pure Gaussian noise; datasets: the top 2000 bright stars in each field;
Analysis: BLS, with minimum transit duration of 0.02 P_test, P_test ∈ [1.0, 100.0] d;
TFA template numbers: 0, 700, 800, 900 and 1000 for 0, a, b, c and d, respectively;
DSP lower detection limits: 5, 6 and 7; Items in the table show the number of
detections (negative dips with DSP > DSP_cut); ab, abc, abcd: datasets used for
finding simultaneous detections.
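The simultaneous-detection criterion behind the ab, abc and abcd rows can be sketched as a set intersection. The function name, the data layout and the three example stars are hypothetical. The two counts in the final arithmetic are taken from Table 3 (field #127, DSP > 7, 2000 stars).

```python
def simultaneous_detections(runs, dsp_cut):
    """Stars detected in every TFA run.

    runs maps a run label to {star_id: (dsp, dip)}. A star counts as
    detected in a run if its dip is negative and DSP > dsp_cut.
    """
    passed = None
    for results in runs.values():
        ok = {star for star, (dsp, dip) in results.items()
              if dip < 0.0 and dsp > dsp_cut}
        passed = ok if passed is None else passed & ok
    return passed if passed is not None else set()


# Hypothetical results for three stars in runs a, b and c.
runs = {
    "a": {"s1": (7.9, -0.012), "s2": (7.4, -0.004), "s3": (8.3, 0.006)},
    "b": {"s1": (8.1, -0.011), "s2": (5.2, -0.003), "s3": (8.0, 0.005)},
    "c": {"s1": (7.6, -0.012), "s2": (7.8, -0.004), "s3": (7.7, 0.006)},
}
survivors = simultaneous_detections(runs, dsp_cut=7.0)

# Arithmetic check with counts from Table 3 (field #127, DSP > 7).
n_stars = 2000
fap_single_c = 220 / n_stars   # run c alone: 11% false alarms
fap_abc = 1 / n_stars          # runs a, b and c together: 0.05%
rejected_fraction = 1.0 - fap_abc
```

In the made-up example only "s1" survives: "s2" falls below the cutoff in run b, and "s3" has a positive dip. The Table 3 counts give a rejected fraction of 99.95%, consistent with the "better than 99.9%" quoted in the text.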



5. Signal Reconstruction

Signal reconstruction is an important, although optional, part of signal processing when
TFA is used. This is because we do not know a priori which part of the
observed signal comes from the systematics and which one from the true signal
(and all these are coupled with noise). Without knowing the signal parameters
a priori, we resort to an iterative scheme in reconstructing the true signal (see
also Sect. 2). Our experiments on the HATNet database show that this
reconstruction can be quite successful without making any assumption on the signal
shape. Once the signal shape is reliably identified, one can proceed with more
specific assumptions, e.g., by using trapezoidal transit shapes, thereby further
decreasing the number of parameters fitted. To illustrate the efficiency of the
TFA reconstruction, we show two examples in Fig. 2.

Figure 2: Upper panels: ensemble photometry of an eclipsing variable in
field #148 (left) and that of HAT-P-1 (right); lower panels: TFA-reconstructed
light curves of the same objects. Headers (from left to right): star ID, number
of data points, average I magnitude, main BLS frequency [d^{-1}], signal-to-noise
ratio (SNR) of the BLS spectrum, DSP, SNR of the out-of-transit variation and
its peak frequency in the units of the BLS frequency. On the right we have: star
ID, average magnitude, plotting frequency [d^{-1}], number of data points, DSP.
The reconstruction of HAT-P-1 was made without using its nearby companion
star ADS 16402 A. In both cases no assumptions were made on the signal shape.
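The iterative reconstruction of Sect. 2 can be sketched as an alternation between fitting the template filter to Y - A and re-estimating A from Y - F. This is an illustration under stated assumptions, not the authors' implementation: the low-parameter model is taken to be the set of phase-bin averages at a known period, the iteration count is fixed, the templates are assumed linearly independent, and all function names and the synthetic data are hypothetical.

```python
def solve(a, b):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_filter(resid, templates):
    """Least-squares filter F(i) = sum_j c_j X_j(i) fitted to Y - A."""
    g = [[sum(p * q for p, q in zip(xj, xk)) for xk in templates]
         for xj in templates]
    h = [sum(p * r for p, r in zip(xj, resid)) for xj in templates]
    c = solve(g, h)
    return [sum(cj * xj[i] for cj, xj in zip(c, templates))
            for i in range(len(resid))]


def bin_model(times, values, period, n_bins):
    """Assumed low-parameter model A(i): phase-bin averages."""
    idx = [int((t / period) % 1.0 * n_bins) % n_bins for t in times]
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for b, v in zip(idx, values):
        sums[b] += v
        counts[b] += 1
    return [sums[b] / counts[b] for b in idx]


def reconstruct(times, target, templates, period, n_bins=10, n_iter=20):
    """Alternate between fitting F to Y - A and updating A from Y - F."""
    mean_y = sum(target) / len(target)
    model = [mean_y] * len(target)      # start from the period-search mode
    for _ in range(n_iter):
        trend = fit_filter([y - a for y, a in zip(target, model)], templates)
        cleaned = [y - f for y, f in zip(target, trend)]     # Y - F
        model = bin_model(times, cleaned, period, n_bins)    # new A
    return model, cleaned


# Synthetic, noise-free example: box-shaped dip plus a linear drift.
times = [0.1 * k + 0.05 for k in range(200)]
period = 2.0
signal = [-0.05 if (t / period) % 1.0 < 0.2 else 0.0 for t in times]
t_mean = sum(times) / len(times)
drift = [t - t_mean for t in times]     # one zero-averaged template
target = [10.0 + s + 0.3 * x for s, x in zip(signal, drift)]
model, cleaned = reconstruct(times, target, [drift], period)
```

In this noise-free case the dip is exactly representable by the bin model, so the iteration converges to the injected signal on top of the mean level of 10.0. With real data the result depends on the noise and on how well the chosen model describes the signal.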



6. Conclusions 

Filtering out systematics from astrophysical time series is nearly mandatory if
a survey-type analysis is made with the goal of reaching the theoretical white
noise limit of signal detection. In the search for transiting extrasolar planets
this issue becomes even more important due to the delicacy of the detection.
We have shown in this paper that TFA is capable of filtering out various
systematics, thereby allowing the detection and a concomitant reconstruction of
faint regular (e.g., simple- or multi-periodic) signals. Furthermore, by requiring
multiple detections in time series filtered by various TFA templates, the false alarm
probability can be pushed down close to the white noise limit.